Robust kaomoji detection in Twitter

نویسندگان

  • Steven Bedrick
  • Russell Beckley
  • Brian Roark
  • Richard Sproat
چکیده

In this paper, we look at the problem of robust detection of a very productive class of Asian style emoticons, known as facemarks or kaomoji. We demonstrate the frequency and productivity of these sequences in social media such as Twitter. Previous approaches to detection and analysis of kaomoji have placed limits on the range of phenomena that could be detected with their method, and have looked at largely monolingual evaluation sets (e.g., Japanese blogs). We find that these emoticons occur broadly in many languages, hence our approach is language agnostic. Rather than relying on regular expressions over a predefined set of likely tokens, we build weighted context-free grammars that reward graphical affinity and symmetry within whatever symbols are used to construct the emoticon.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Detection of Twitter Users' Attitudes about Flu Vaccine based on the Content and Sentiment Analysis of the Sent Tweets

Introduction: The influenza vaccine is one of the controversial challenges in today's societies. Considering the importance of using the flu vaccine in preventing the spread of influenza virus, the Twitter network, as a rich source of data, provides suitable conditions for research in this field to examine the attitudes of different people about this vaccine. The results in one hand will help h...

متن کامل

Detection of Twitter Users' Attitudes about Flu Vaccine based on the Content and Sentiment Analysis of the Sent Tweets

Introduction: The influenza vaccine is one of the controversial challenges in today's societies. Considering the importance of using the flu vaccine in preventing the spread of influenza virus, the Twitter network, as a rich source of data, provides suitable conditions for research in this field to examine the attitudes of different people about this vaccine. The results in one hand will help h...

متن کامل

Robust Sentiment Detection on Twitter from Biased and Noisy Data

In this paper, we propose an approach to automatically detect sentiments on Twitter messages (tweets) that explores some characteristics of how tweets are written and meta-information of the words that compose these messages. Moreover, we leverage sources of noisy labels as our training data. These noisy labels were provided by a few sentiment detection websites over twitter data. In our experi...

متن کامل

On-line Trend Analysis with Topic Models: \#twitter Trends Detection Topic Model Online

We present a novel topic modelling-based methodology to track emerging events in microblogs such as Twitter. Our topic model has an in-built update mechanism based on time slices and implements a dynamic vocabulary. We first show that the method is robust in detecting events using a range of datasets with injected novel events, and then demonstrate its application in identifying trending topics...

متن کامل

Identification and Robust Fault Detection of Industrial Gas Turbine Prototype Using LLNF Model

In this study, detection and identification of common faults in industrial gas turbines is investigated. We propose a model-based robust fault detection(FD) method based on multiple models. For residual generation a bank of Local Linear Neuro-Fuzzy (LLNF) models is used. Moreover, in fault detection step, a passive approach based on adaptive threshold is employed. To achieve this purpose, the a...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2012